Guidelines for annotating the LUNA corpus with frame information
Similar resources
PAYMA: A Tagged Corpus of Persian Named Entities
The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...
Guidelines for Annotating Temporal Information
This paper introduces a set of guidelines for annotating time expressions with a canonicalized representation of the times they refer to. Applications that can benefit from such an annotated corpus include information extraction (e.g., normalizing temporal references for database entry), question answering (answering “when” questions), summarization (temporally ordering information), machine tr...
Balancing the Existing and the New in the Context of Annotating Non-Canonical Language
The importance of balancing linguistic considerations, annotation practicalities, and end user needs in developing language annotation guidelines is discussed. Maintaining a clear view of the various goals and fostering collaboration and feedback across levels of annotation and between corpus creators and corpus users is helpful in determining this balance. Annotating non-canonical language bri...
Domain-related Annotation of Polish Spoken Dialogue Corpus LUNA.PL
In this paper we present a corpus of Polish spoken dialogues annotated on several levels, from transcription of the dialogues and their morphosyntactic analysis to semantic annotation. The corpus is one of the results of the LUNA project. The description concentrates on semantic annotation at the levels of concepts (attribute-value) and predicates (frame sets).
An Overview of the CRAFT Concept Annotation Guidelines
We present our concept-annotation guidelines for a large multi-institutional effort to create a gold-standard manually annotated corpus of full-text biomedical journal articles. We are semantically annotating these documents with the full term sets of eight large biomedical ontologies and controlled terminologies ranging from approximately 1,000 to millions of terms, and, using these guideline...
Publication date: 2010